Introduction
In the first section, we introduced the Artificial Neuron and demonstrated how it could be deployed to recognise patterns. In this lesson, we shall look at a different type of network called the Competitive Network. As is common in the ANN field, there are various iterations of this concept, and its relations may also be known as ‘Kohonen’, ‘Winner Takes All’ or ‘Self-Organising’ networks. Their key characteristic is that they are good at identifying patterns in data even when the programmer or subject-area experts are not aware of the nature of those patterns. A competitive network is trained by detecting subtle patterns in data and then exaggerating those patterns in successive training cycles. After a period of training, those patterns containing a meaningful characterising structure become more pronounced and thus clearly identifiable by the ANN.
Basic mechanism of a competitive network
As noted, a competitive network is trained by detecting subtle patterns in data and then exaggerating those patterns in successive training cycles. Consider the following two input / three neuron network:
Figure 1: 2 input / 3 neuron ANN
Prior to training, it can be assumed that the weights between the inputs and the neurons are initialised as random numbers.
Once a pattern is applied to the input, the data is propagated through the network in the normal manner and the output is calculated. Given the random allocation of weights at this stage, one of the neurons will, by chance, have a higher output than the others: this neuron is deemed to have ‘won’. To train the network, we assign the winning output to 1 and all others to 0 and modify only those weights associated with the winning neuron. By doing so, the transfer function of the network is changed such that on subsequent training cycles, the same subtle pattern in the data results in an exaggerated difference at the output. To illustrate this mechanism, suppose that on application of a pattern, the output of neuron 2 won:
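The forward pass and winner-take-all step just described can be sketched in plain Python, since the mechanism itself is language-independent. The weights and inputs below are arbitrary illustrative values, not taken from the lesson:

```python
# Illustrative values: 2 inputs, 3 neurons
inputs = [0.5, -0.3]
weights = [[0.2, 0.7],    # weights into neuron 1: [w11, w21]
           [-0.4, 0.1],   # weights into neuron 2: [w12, w22]
           [0.6, -0.8]]   # weights into neuron 3: [w13, w23]

# Forward pass: each output is the weighted sum of the inputs
outputs = [sum(i * w for i, w in zip(inputs, ws)) for ws in weights]

# Winner-take-all: the largest output is set to 1, all others to 0
winner = outputs.index(max(outputs))
binary = [1 if k == winner else 0 for k in range(len(outputs))]
print(winner, binary)
```

Only the weights feeding the winning neuron would then be modified during training.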
Figure 2: Functioning of the competitive network
The weights, designated for training, are modified by the following simple formula:

W′ = W + η × (I − W)

where:
- W′ is the modified (trained) weight, and W is its current value.
- I is the input pattern value associated with the weight to be trained, i.e. I1 or I2.
- η is the learning rate, nominally set to 1 but it can be modified to increase or decrease the magnitude of change implemented on each training loop.
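The weight-update rule just described (each weight moves toward its associated input value, scaled by the learning rate η) can be tried with concrete numbers. A minimal Python sketch, with arbitrary illustrative values:

```python
# The update rule: the weight moves toward the input value; eta controls how far
def train_weight(w, i, eta):
    return w + eta * (i - w)

# With the nominal learning rate eta = 1, the weight jumps straight to the input
full = train_weight(0.2, 0.9, 1.0)

# With a smaller rate, it moves only part of the way: 0.2 + 0.3*(0.9 - 0.2) = 0.41
partial = train_weight(0.2, 0.9, 0.3)
print(full, partial)
```

Repeating the partial update over successive training cycles is what gradually exaggerates the winning neuron's response to the pattern.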
If multiple patterns were applied to the network during training, it can be assumed that sufficiently differing patterns would result in different outputs winning. Via this mechanism, during the training period, different patterns are recognised as such and assigned to different outputs. Similarly, very similar patterns are recognised as such and grouped together on the same output.
Note that this assignment occurs without programmer intervention; in this sense, the network self-organises to identify and categorise patterns. A competitive network, therefore, utilises a form of unsupervised learning. By using an unsupervised learning mechanism, the ANN is capable of finding patterns in data that the programmer is unaware of (as the programmer does not need to construct training pairs to feed into the network). A common application of such networks is in iris recognition; whilst human experts know that each of our irises is unique, creating software to quickly identify those unique structures within an iris has historically been challenging.
There are various methods of implementing Competitive Networks, and the example given above is only one. However, the format of the weights and inputs is critical in all of them, which makes such networks rather sensitive to their setup.
Competitive network operation in the vector space
In the simple network illustrated previously, there are two inputs and three neurons. The value of the two inputs can be drawn as a vector in 2D space, with I1 and I2 set upon the x and y axes respectively:
Figure 3: Input vectors on a 2D axis
Note that the magnitude of the Input Vector (L) is given by:

|L| = √(I1² + I2²)
Next, it should be noted that the two weights associated with a winning neuron can also be placed in the same 2D space:
Figure 4: Weights added to the 2D axis
When we are calculating any given output, in effect we are calculating the dot product of the vectors L and W:

out = L · W = (I1 × W1) + (I2 × W2) = |L| |W| cos θ

In our 2D vector space, the dot product represents a measure of how ‘aligned’ two vectors are: if they point in the same direction, the output is large.
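This alignment property is easy to check numerically. A small Python sketch, with arbitrary illustrative angles:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Three unit vectors: an input direction and two candidate weight directions
L  = [math.cos(math.radians(30)),  math.sin(math.radians(30))]   # input at 30 degrees
Wa = [math.cos(math.radians(35)),  math.sin(math.radians(35))]   # nearly aligned
Wb = [math.cos(math.radians(150)), math.sin(math.radians(150))]  # far from the input

# For unit vectors the dot product equals cos(angle between them):
# close alignment gives a value near 1, opposition a value near -1
print(dot(L, Wa), dot(L, Wb))
```

The nearly-aligned vector Wa produces a much larger dot product than Wb, so the neuron holding Wa would win.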
We can add the weight vector for all output neurons onto this graph and the winning output is the one corresponding to that weight vector most closely aligned to the input vector.
Figure 5: Identification of winning weight vector
Note that all vectors have been converted to unit vectors by dividing each vector by its magnitude (i.e. its length):

L̂ = L / |L|  and  Ŵ = W / |W|
The result of training is to move the unit vector of the weights (associated with the winning neuron) into closer alignment with the input vector (L).
By doing so, training acts by increasing the likelihood of the same output neuron winning on input of a similar pattern.
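This effect can be checked numerically: one training step followed by renormalisation reduces the angle between the winning weight vector and the input vector. A Python sketch with illustrative values:

```python
import math

L = [0.6, 0.8]   # input pattern (already a unit vector: 0.36 + 0.64 = 1)
W = [1.0, 0.0]   # winning neuron's weight vector, initially at 0 degrees
eta = 0.3

def gap(a, b):
    # absolute angle between two 2D vectors, in radians
    d = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    return min(d, 2 * math.pi - d)

before = gap(W, L)

# One training step (W' = W + eta*(I - W)), then renormalise to a unit vector
W = [w + eta * (i - w) for i, w in zip(L, W)]
mag = math.hypot(W[0], W[1])
W = [w / mag for w in W]

after = gap(W, L)
print(before, after)   # the angular gap shrinks
```

Because the gap has shrunk, the same neuron will respond even more strongly to a similar input next time.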
Figure 6: Visualising training on a 2D vector space
We shall illustrate this vector-space understanding of competitive network training by creating a simple ANN in MATLAB and plotting the results of training.
Programming exercise 6
In this section, we shall build a program to illustrate how the weight vectors are modified during training of a competitive network in a 2D space. In MATLAB we shall model the same basic network that we considered earlier:
Figure 7: Network for programming exercise 6
To begin, we shall write a short introduction to identify the program and assign some random numbers between -1 and 1 to each weight.
Note: you may choose to select other weight values.
Notation
- w13 refers to the weight between input I1 and neuron N3.
%{
…
Doug Rattray – doug.rattray@uhi.ac.uk
student ID – Lews Castle College, UHI
Date
ANN – Section 7 – Programming Exercise 6
…
%}
%--------------------------------------------------------%
% Initial Weights
%--------------------------------------------------------%
w11 = 0.6;
w12 = -0.2;
w13 = -0.4;
w21 = 0.2;
w22 = -0.4;
w23 = 0.8;
We shall then calculate the weight vectors associated with each neuron and convert them to unit vectors.
Notation
- w_vec1 refers to the weight vector associated with N1.
- w_vec1u refers to the unit vector associated with N1.
%--------------------------------------------------------%
% Weight Vectors
%--------------------------------------------------------%
w_vec1 = complex (w11, w21);
w_vec1u = w_vec1 / abs(w_vec1);
w_vec2 = complex (w12, w22);
w_vec2u = w_vec2 / abs(w_vec2);
w_vec3 = complex (w13, w23);
w_vec3u = w_vec3 / abs(w_vec3);
We shall also calculate the input unit vector for the following input pattern:
[in1, in2] = [0.3, -0.8]
%--------------------------------------------------------%
% Input Vectors
%--------------------------------------------------------%
in1 = 0.3;
in2 = -0.8;
in_vec = complex (in1, in2);
in_vecu = in_vec / abs(in_vec);
We may now plot all of the (pre-trained) unit vectors on a 2D plot in MATLAB.
Note that the input vector is drawn in black and the weight vector associated with each output neuron is assigned the following colour:
- Neuron 1 – Red
- Neuron 2 – Blue
- Neuron 3 – Green
%--------------------------------------------------------%
% First Plot (pre-training)
%--------------------------------------------------------%
figure;
compass(real(w_vec1u), imag(w_vec1u), 'red'); hold on
compass(real(w_vec2u), imag(w_vec2u), 'blue');
compass(real(w_vec3u), imag(w_vec3u), 'green');
compass(real(in_vecu), imag(in_vecu), 'black');
hold off;
As we know, the winning output is the one corresponding to that weight vector most closely aligned to the input vector. In this plot, we can therefore predict that the winning output is the blue one, i.e. Neuron 2.
Figure 8: Vector plot (pre-training)
We may now conduct a forward pass through the network and use the max() function to determine which output neuron is the winner.
Remember that once the winner has been determined, its value is set to 1 with all other neurons set to 0.
Note that when running this code, MATLAB displays the message ‘Neuron 2 wins’, as expected:
%--------------------------------------------------------%
% Forward Pass
%--------------------------------------------------------%
out1 = (in1 * w11) + (in2 * w21);
out2 = (in1 * w12) + (in2 * w22);
out3 = (in1 * w13) + (in2 * w23);
%--------------------------------------------------------%
% Assign Winner
%--------------------------------------------------------%
[val, winner] = max([out1, out2, out3]);
if (winner == 1)
    disp("Neuron 1 wins")
    outputArr = [1, 0, 0];
elseif (winner == 2)
    disp("Neuron 2 wins")
    outputArr = [0, 1, 0];
elseif (winner == 3)
    disp("Neuron 3 wins")
    outputArr = [0, 0, 1];
end
Now that a winner has been identified, we shall train only the weights associated with that neuron. Remember that each weight associated with the winning neuron is trained using the rule W′ = W + η × (I − W); we shall use a learning rate of η = 0.3.
Note that I have omitted the code for (winner == 1) and (winner == 2). You should extrapolate and complete these sections.
%--------------------------------------------------------%
% Training
%--------------------------------------------------------%
n = 0.3;
if (winner == 1)
    ...
elseif (winner == 2)
    disp("Training weights for neuron 2")
    w12 = w12 + n*(in1 - w12);
    w22 = w22 + n*(in2 - w22);
    w_vec2 = complex(w12, w22);
    w12 = w12 / abs(w_vec2);
    w22 = w22 / abs(w_vec2);
    w_vec2u = complex(w12, w22);
elseif (winner == 3)
    ...
end
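For readers without MATLAB, the same training step can be verified in plain Python using the exercise's own numbers (neuron 2's weights and the input pattern above):

```python
import math

# Neuron 2's weights and the input pattern from the exercise
in1, in2 = 0.3, -0.8
w12, w22 = -0.2, -0.4
n = 0.3

def gap(ax, ay, bx, by):
    # absolute angle between two 2D vectors, in radians
    d = abs(math.atan2(ay, ax) - math.atan2(by, bx))
    return min(d, 2 * math.pi - d)

before = gap(w12, w22, in1, in2)

# The training step, exactly as in the MATLAB code
w12 = w12 + n * (in1 - w12)   # -0.2 + 0.3*(0.3 - (-0.2)) = -0.05
w22 = w22 + n * (in2 - w22)   # -0.4 + 0.3*(-0.8 - (-0.4)) = -0.52

after = gap(w12, w22, in1, in2)
print(w12, w22, after < before)
```

The trained weight vector points closer to the input direction, which is exactly what the post-training compass plot shows.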
Following this training, we can re-plot all of the (post-trained) unit vectors on a 2D plot in MATLAB.
Note that once again the input vector is drawn in black and the weight vector associated with each output neuron is assigned the following colour:
- Neuron 1 – Red
- Neuron 2 – Blue
- Neuron 3 – Green
%--------------------------------------------------------%
% Re-plot (post-training)
%--------------------------------------------------------%
figure;
compass(real(w_vec1u), imag(w_vec1u), 'red'); hold on
compass(real(w_vec2u), imag(w_vec2u), 'blue');
compass(real(w_vec3u), imag(w_vec3u), 'green');
compass(real(in_vecu), imag(in_vecu), 'black');
hold off;
Comparing this to the previous plot, it can be seen that the weight vector associated with the winning neuron (N2) has been modified such that it is more closely aligned with the input vector.
On receipt of a similar pattern, the likelihood of neuron N2 winning once again has now increased. In this sense, it can be seen that the network has begun to be trained to recognise this pattern.
Figure 9: Vector plot (post-training)
You may wish to experiment with various inputs to the network so that you can see how the network responds to various patterns.
Whilst this network is structured to accept only two inputs, it is possible for us to visualise the training effect on a two-dimensional plane. If the number of inputs were increased to three, we would need three dimensions to visualise the effect. In practice, the patterns that these networks are trained upon are often vastly larger than just two or three values, so it may not be possible to visualise the effect in the manner we programmed above. However, the conceptualisation of vector direction being modified in response to training in a competitive network (and the vector dot-product mathematics involved) remains valid.
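To see that the mechanism generalises beyond two dimensions, here is a sketch in Python of the same winner-take-all and update steps applied to five-dimensional patterns. The dimensions, seed and values are arbitrary, chosen only for illustration:

```python
import math
import random

random.seed(1)   # reproducible illustrative weights

def unit(v):
    # normalise a vector of any dimension to unit length
    m = math.sqrt(sum(x * x for x in v))
    return [x / m for x in v]

# Same mechanism as the 2-input network, just with longer vectors
dims, neurons, eta = 5, 3, 0.3
weights = [unit([random.uniform(-1, 1) for _ in range(dims)])
           for _ in range(neurons)]

def train_step(pattern):
    pattern = unit(pattern)
    # forward pass: dot product of the pattern with each weight vector
    outputs = [sum(i * w for i, w in zip(pattern, ws)) for ws in weights]
    winner = outputs.index(max(outputs))
    # train (and renormalise) only the winning neuron's weights
    weights[winner] = unit([w + eta * (i - w)
                            for i, w in zip(pattern, weights[winner])])
    return winner

win = train_step([0.9, 0.1, -0.3, 0.4, 0.2])
print(win)
```

Nothing in the code depends on the number of inputs being two; only our ability to draw the vectors does.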
Advanced competitive networks
The competitive networks illustrated previously are not usually used in such a simple form; more commonly, they are implemented as part of larger and more complex networks. A well-known complex network that implements competitive principles is the ‘Kohonen Self-Organising Map (SOM)’. In Kohonen's network, the neurons are conceptualised as being placed on a two-dimensional grid and, once a winning neuron is determined, all of the neurons within the locality of the winner are trained, with those further away often trained to a lesser extent (as controlled via the training rate ‘η’).
Figure 10: The basic self-organising map
The benefit of a self-organising map is that once the network has been fully trained it groups patterns that are most similar closer together on the grid, thus producing an ordered ‘map’ of patterns.
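The neighbourhood idea can be sketched as follows. This is a minimal sketch only: it assumes a Gaussian fall-off of training strength with grid distance (one common choice) and distance-based winner selection (another common SOM convention, equivalent to the dot product for unit vectors); the grid size, σ and weight values are all illustrative:

```python
import math

# Neurons on a 3x3 grid, each holding a 2-element weight vector (illustrative values)
grid = {(r, c): [0.1 * r, 0.1 * c] for r in range(3) for c in range(3)}
eta, sigma = 0.5, 1.0
pattern = [1.0, 1.0]

def dist2(a, b):
    # squared Euclidean distance between two weight/pattern vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Winner: the grid cell whose weights are closest to the pattern
winner = min(grid, key=lambda pos: dist2(grid[pos], pattern))

# Train ALL neurons, scaled by their grid distance to the winner:
# the winner's neighbours are pulled toward the pattern too, but less strongly
for pos, w in grid.items():
    d2 = (pos[0] - winner[0]) ** 2 + (pos[1] - winner[1]) ** 2
    h = math.exp(-d2 / (2 * sigma ** 2))   # neighbourhood strength: 1 at the winner
    grid[pos] = [wi + eta * h * (pi - wi) for wi, pi in zip(w, pattern)]
```

Because neighbouring cells are dragged along with each winner, similar patterns end up mapped to nearby grid positions, which is what produces the ordered map.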
Other examples of competitive networks include the two-layer Counter-propagation network and the Adaptive Resonance Theory (ART) network, a complex network that can grow to accommodate new patterns.
Summary
In summary:
- In this section, we introduced the Competitive Network and learned how such a network is trained.
- We also learned how to conceptualise such a network and its training effect in 2D space.
- In Programming Exercise 6, we created a simple competitive network in MATLAB to demonstrate its function.
- Finally, we provided a brief overview of various other, more complicated forms of competitive network, whose applications you should study in your own time.